AITopics | maximum entropy regularization

maximum entropy regularization

Connectionist Temporal Classification with Maximum Entropy Regularization

Neural Information Processing Systems, Nov-20-2025, 23:08:44 GMT

Connectionist Temporal Classification (CTC) is an objective function for end-to-end sequence learning, which adopts dynamic programming algorithms to directly learn the mapping between sequences. CTC has shown promising results in many sequence learning applications including speech recognition and scene text recognition. However, CTC tends to produce highly peaky and overconfident distributions, which is a symptom of overfitting. To remedy this, we propose a regularization method based on maximum conditional entropy which penalizes peaky distributions and encourages exploration. We also introduce an entropy-based pruning method to dramatically reduce the number of CTC feasible paths by ruling out unreasonable alignments. Experiments on scene text recognition show that our proposed methods consistently improve over the CTC baseline without the need to adjust training settings.
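To make the idea of penalizing peaky alignment distributions concrete, here is a minimal pure-Python sketch. It assumes the regularized objective takes the form "CTC loss minus beta times the entropy of the feasible paths given the input and label"; the function names, the `beta` default, and the brute-force path enumeration are illustrative assumptions rather than the authors' implementation, which would rely on dynamic programming.

```python
import itertools
import math

BLANK = 0  # assumed index of the CTC blank symbol


def collapse(path):
    """CTC collapse: merge repeated symbols, then drop blanks."""
    out, prev = [], None
    for s in path:
        if s != prev and s != BLANK:
            out.append(s)
        prev = s
    return tuple(out)


def feasible_paths(label, num_frames, num_symbols):
    """Enumerate every frame-level path that collapses to `label` (toy sizes only)."""
    target = tuple(label)
    return [p for p in itertools.product(range(num_symbols), repeat=num_frames)
            if collapse(p) == target]


def entropy_regularized_ctc(probs, label, beta=0.2):
    """Return (total, ctc, entropy) for per-frame posteriors probs[t][k].

    total = ctc - beta * entropy, so a peaky path distribution
    (low entropy) receives less of a reduction than a spread-out one.
    """
    paths = feasible_paths(label, len(probs), len(probs[0]))
    path_p = [math.prod(probs[t][s] for t, s in enumerate(p)) for p in paths]
    p_label = sum(path_p)                      # p(label | x)
    ctc = -math.log(p_label)                   # standard CTC loss
    entropy = -sum((q / p_label) * math.log(q / p_label)
                   for q in path_p if q > 0)   # H(path | label, x)
    return ctc - beta * entropy, ctc, entropy
```

With uniform per-frame posteriors every feasible path is equally likely, so the entropy term is at its maximum; when one alignment dominates, the entropy approaches zero and the regularizer contributes little.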

connectionist temporal classification, maximum entropy regularization, name change, (3 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Reviews: Connectionist Temporal Classification with Maximum Entropy Regularization

Neural Information Processing Systems, Oct-8-2024, 07:48:52 GMT

This work presents a method for end-to-end sequence learning within the framework of Connectionist Temporal Classification (CTC). The paper has two main contributions. The first is a regularizer for the CTC training objective that reduces the model's over-confidence. The authors base it on conditional entropy: the regularizer encourages the model to explore paths that are close to the dominant one. To do so, they assume that consecutive elements of a sequence are equally spaced.
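The equal-spacing assumption mentioned in the review can be pictured as a filter on CTC alignments. The sketch below is only one possible reading: it keeps a path when every label segment (from the frame where a label starts to the frame where the next one starts, or to the end of the sequence) spans at most `tau * T / |label|` frames. The rule, the `tau` parameter, and all function names are assumptions made for illustration and may differ from the paper's exact constraint.

```python
import itertools

BLANK = 0  # assumed index of the CTC blank symbol


def collapse(path):
    """CTC collapse: merge repeated symbols, then drop blanks."""
    out, prev = [], None
    for s in path:
        if s != prev and s != BLANK:
            out.append(s)
        prev = s
    return tuple(out)


def segment_starts(path):
    """Frame indices at which a new non-blank label begins."""
    starts, prev = [], None
    for t, s in enumerate(path):
        if s != BLANK and s != prev:
            starts.append(t)
        prev = s
    return starts


def roughly_equally_spaced(path, num_labels, tau=1.0):
    """Hypothetical rule: no segment may exceed tau * T / num_labels frames."""
    bounds = segment_starts(path) + [len(path)]
    limit = tau * len(path) / num_labels
    return all(b - a <= limit for a, b in zip(bounds, bounds[1:]))


def feasible_paths(label, num_frames, num_symbols):
    """Enumerate every path that collapses to `label` (toy sizes only)."""
    target = tuple(label)
    return [p for p in itertools.product(range(num_symbols), repeat=num_frames)
            if collapse(p) == target]


# Toy case: label (1, 2), four frames, three symbols including blank.
all_paths = feasible_paths((1, 2), 4, 3)
kept = [p for p in all_paths if roughly_equally_spaced(p, 2, tau=1.0)]
```

In this toy case the filter rejects alignments such as `(1, 1, 1, 2)`, where the first label occupies three of the four frames, and keeps balanced ones such as `(1, 1, 2, 2)`.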

connectionist temporal classification, contribution, maximum entropy regularization, (2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (0.40)

FedMAX: Mitigating Activation Divergence for Accurate and Communication-Efficient Federated Learning

Chen, Wei, Bhardwaj, Kartikeya, Marculescu, Radu

arXiv.org Machine Learning, Apr-7-2020

In this paper, we identify a new phenomenon called activation divergence, which occurs in Federated Learning (FL) due to data heterogeneity (i.e., data being non-IID) across multiple users. Specifically, we argue that the activation vectors in FL can diverge even if subsets of users share a few common classes, with the data residing on different devices. To address the activation-divergence issue, we introduce a prior based on the principle of maximum entropy; this prior assumes minimal information about the per-device activation vectors and aims to make the activation vectors of the same classes as similar as possible across multiple devices. Our results show that, in both IID and non-IID settings, our proposed approach achieves better accuracy (due to the significantly more similar activation vectors across multiple devices) and is more communication-efficient than state-of-the-art approaches in FL. Finally, we illustrate the effectiveness of our approach on a few common benchmarks and two large medical datasets.
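One way to picture a maximum-entropy prior on activation vectors is as a penalty that pulls the softmax of each activation vector toward the uniform distribution. The sketch below assumes the prior is realized as a KL divergence to uniform, added to the local cross-entropy loss with a weight `beta`; that form, the weight, and the function names are assumptions for illustration, not a confirmed description of FedMAX.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    z = sum(exps)
    return [e / z for e in exps]


def kl_to_uniform(activations):
    """KL(softmax(a) || U). Equals log(K) - H(softmax(a)), so minimizing it
    maximizes the entropy of the normalized activation vector."""
    p = softmax(activations)
    k = len(p)
    return sum(q * math.log(q * k) for q in p if q > 0)


def local_objective(ce_loss, batch_activations, beta=1.0):
    """Hypothetical per-device loss: cross-entropy plus the averaged prior term."""
    reg = sum(kl_to_uniform(a) for a in batch_activations) / len(batch_activations)
    return ce_loss + beta * reg
```

Because every device applies the same pull toward uniform, the activation vectors have less room to drift apart across devices, which is the intuition the abstract describes.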

activation vector, dataset, fedmax, (11 more...)

arXiv: 2004.03657

Country:

North America > United States > Texas > Travis County > Austin (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > United States > Virginia (0.04)
(2 more...)

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine > Diagnostic Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Connectionist Temporal Classification with Maximum Entropy Regularization

Liu, Hu, Jin, Sheng, Zhang, Changshui

Neural Information Processing Systems, Feb-14-2020, 06:42:48 GMT

connectionist temporal classification, maximum entropy regularization, sequence, (1 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (0.40)

Incremental Learning with Maximum Entropy Regularization: Rethinking Forgetting and Intransigence

Kim, Dahyun, Bae, Jihwan, Jo, Yeonsik, Choi, Jonghyun

arXiv.org Machine Learning, Feb-2-2019

Incremental learning suffers from two challenging problems: forgetting of old knowledge and intransigence in learning new knowledge. Predictions by a model incrementally learned with a subset of the dataset are thus uncertain, and the uncertainty accumulates through the tasks by knowledge transfer. To prevent overfitting to the uncertain knowledge, we propose to penalize confident fitting to the uncertain knowledge with the Maximum Entropy Regularizer (MER). Additionally, to reduce class imbalance and induce a self-paced curriculum on new classes, we exclude a few samples from the new classes in every mini-batch, which we call DropOut Sampling (DOS). We further rethink evaluation metrics for forgetting and intransigence in incremental learning by tracking each sample's confusion at the transition of a task, since the existing metrics that compute the difference in accuracy are often misleading. In extensive empirical validation on CIFAR100 and a popular subset of ImageNet (TinyImageNet), we show that the proposed method, named 'MEDIC', outperforms state-of-the-art incremental learning algorithms by a large margin in accuracy, forgetting, and intransigence, as measured by both the existing and the proposed metrics.
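The two ingredients described above can be sketched in a few lines of pure Python. The sketch assumes MER takes the common confidence-penalty form (cross-entropy minus `lam` times the entropy of the predicted distribution) and that DOS drops each new-class sample from a mini-batch independently with probability `drop_rate`; both forms, along with the names `mer_loss`, `dropout_sampling`, `lam`, and `drop_rate`, are illustrative assumptions rather than the authors' exact formulation.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    z = sum(exps)
    return [e / z for e in exps]


def mer_loss(logits, target, lam=0.5):
    """Cross-entropy minus lam * entropy: confident (low-entropy)
    predictions get less of a reduction than uncertain ones."""
    p = softmax(logits)
    cross_entropy = -math.log(p[target])
    entropy = -sum(q * math.log(q) for q in p if q > 0)
    return cross_entropy - lam * entropy


def dropout_sampling(batch, new_classes, drop_rate, rng):
    """Keep all old-class samples; drop each new-class sample
    with probability drop_rate. `batch` is a list of (x, y) pairs."""
    return [(x, y) for x, y in batch
            if y not in new_classes or rng.random() >= drop_rate]


# Usage on a toy mini-batch where classes 7 and 8 are the new ones.
rng = random.Random(0)
batch = [("a", 0), ("b", 1), ("c", 7), ("d", 8)]
reduced = dropout_sampling(batch, {7, 8}, drop_rate=0.5, rng=rng)
loss = mer_loss([2.0, 0.5, -1.0], target=0, lam=0.5)
```

Setting `lam` to zero recovers plain cross-entropy, and setting `drop_rate` to zero leaves the mini-batch unchanged, which makes both components easy to ablate.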

configuration, incremental learning, intransigence, (15 more...)

arXiv: 1902.00829

Country: Asia > South Korea > Gwangju > Gwangju (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (0.61)
